911.csvdplyr, tidyr, xts, lubridate, qtlcharts, forecast, tseries, leaflet, ggplot2, plotly, dygraphs, viridis, graphicsThis is an exploratory analysis for data collected from 911 calls in Montgomery County.
Members of rescue team always risk many people’s lives while they are working. Their work requires great deal of concentration for a very long time, as one little mistake can result in death. Hence, sufficient time for rest and relaxation is extremely important. However, a certain number of rescue workers on call are always necessary because it is impossible to know what kind of accident would happen in the future. We thought, if we can predict and forecast a general number of accidents in the future, it would be easier to allocate proper number of rescue workers at the right time. Which may also lead to a proper amount of rest time for them.
The source of the 911.csv data is “http://montcoalert.org/”. This page provides information of 911 calls in Montgomery County. Montgomery County is located in Commonwealth of Pennsylvania. The data are collected from Dec 10, 2015 to Oct 25, 2016.
# Load packages
library(dplyr) # Data manipulation
library(tidyr) # Data manipulation
library(xts) # Creating xts's object
library(lubridate) #Helps dealing with dates a little easier
library(qtlcharts) # Making plot of correlation table
library(forecast) # Time series Forecast
library(tseries) # Time series analysis
library(leaflet) # Visualization
library(ggplot2) # Visualization
library(plotly) # Visualization
library(dygraphs) # Visualization
library(viridis) # Colormap
library(graphics) # Mosaic plotNow that our packages are loaded, let’s read in and check the attributes of data.
# Check data
call <- read.csv("C:\\Users\\hufs\\Desktop\\EDA\\911.csv")
knitr::kable(head(call,n=3))| lat | lng | desc | zip | title | timeStamp | twp | addr | e |
|---|---|---|---|---|---|---|---|---|
| 40.29788 | -75.58129 | REINDEER CT & DEAD END; NEW HANOVER; Station 332; 2015-12-10 @ 17:10:52; | 19525 | EMS: BACK PAINS/INJURY | 2015-12-10 17:40:00 | NEW HANOVER | REINDEER CT & DEAD END | 1 |
| 40.25806 | -75.26468 | BRIAR PATH & WHITEMARSH LN; HATFIELD TOWNSHIP; Station 345; 2015-12-10 @ 17:29:21; | 19446 | EMS: DIABETIC EMERGENCY | 2015-12-10 17:40:00 | HATFIELD TOWNSHIP | BRIAR PATH & WHITEMARSH LN | 1 |
| 40.12118 | -75.35198 | HAWS AVE; NORRISTOWN; 2015-12-10 @ 14:39:21-Station:STA27; | 19401 | Fire: GAS-ODOR/LEAK | 2015-12-10 17:40:00 | NORRISTOWN | HAWS AVE | 1 |
str(call) # 123884 obs. of 9 variables## 'data.frame': 123884 obs. of 9 variables:
## $ lat : num 40.3 40.3 40.1 40.1 40.3 ...
## $ lng : num -75.6 -75.3 -75.4 -75.3 -75.6 ...
## $ desc : Factor w/ 123847 levels " ; ; 2016-03-09 @ 05:15:47;",..: 87476 13086 47179 3373 19732 16789 58508 22429 64201 12311 ...
## $ zip : int 19525 19446 19401 19401 NA 19446 19044 19426 19438 19462 ...
## $ title : Factor w/ 117 levels "EMS: ABDOMINAL PAINS",..: 10 21 88 16 23 38 46 54 62 115 ...
## $ timeStamp: Factor w/ 91782 levels "2015-12-10 17:40:00",..: 1 1 1 2 2 2 2 2 2 2 ...
## $ twp : Factor w/ 69 levels "","ABINGTON",..: 36 20 37 37 30 23 21 49 32 42 ...
## $ addr : Factor w/ 24150 levels "",".","10TH AVE",..: 17261 2302 9195 569 3713 3088 11374 4272 12392 2084 ...
## $ e : int 1 1 1 1 1 1 1 1 1 1 ...
These are the names, class type of variables. We can also check first few observations. In total, there are 123884 observations and 9 variables. Simple description of the variables is as follows. :
| Variable Name | Description |
|---|---|
| lat | Latitude |
| lng | Longitude |
| desc | Description of the Emergency Call (EMS: Emergency Medical Service, Fire: Fire Accident, Traffic: Traffic Accident) |
| zip | Zipcode |
| title | Title |
| timeStamp | YYYY-MM-DD HH:MM:SS |
| twp | Township |
| addr | Address |
| e | Dummy variable (always 1) |
We had to handle the data. We created some new variables from existing variables and removed variables that we did not need.
calls <- separate(call,title,c("Types","Subtypes"),sep=":")
# Separate Title into Types and Subtypes.
calls$timeStamp <- as.POSIXlt(call$timeStamp)
calls$Date <- as.Date(calls$timeStamp)
calls$Year<-factor(year(calls$Date))
calls$Month<-factor(month(calls$Date))
calls$Day<-factor(day(calls$Date))
calls$Hour <- factor(calls$timeStamp$hour)
calls$zip<-factor(calls$zip)
calls$Subtypes<-factor(calls$Subtypes)
calls$Types<-factor(calls$Types)
# Handling timestamp variable and change the classes of some variables into factor.
calls<-calls[,-10]
# Erase unnecessary dummy variable, e, as it is always 1.
knitr::kable(head(calls,n=3))| lat | lng | desc | zip | Types | Subtypes | timeStamp | twp | addr | Date | Year | Month | Day | Hour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 40.29788 | -75.58129 | REINDEER CT & DEAD END; NEW HANOVER; Station 332; 2015-12-10 @ 17:10:52; | 19525 | EMS | BACK PAINS/INJURY | 2015-12-10 17:40:00 | NEW HANOVER | REINDEER CT & DEAD END | 2015-12-10 | 2015 | 12 | 10 | 17 |
| 40.25806 | -75.26468 | BRIAR PATH & WHITEMARSH LN; HATFIELD TOWNSHIP; Station 345; 2015-12-10 @ 17:29:21; | 19446 | EMS | DIABETIC EMERGENCY | 2015-12-10 17:40:00 | HATFIELD TOWNSHIP | BRIAR PATH & WHITEMARSH LN | 2015-12-10 | 2015 | 12 | 10 | 17 |
| 40.12118 | -75.35198 | HAWS AVE; NORRISTOWN; 2015-12-10 @ 14:39:21-Station:STA27; | 19401 | Fire | GAS-ODOR/LEAK | 2015-12-10 17:40:00 | NORRISTOWN | HAWS AVE | 2015-12-10 | 2015 | 12 | 10 | 17 |
str(calls) # 123884 obs. of 13 variables## 'data.frame': 123884 obs. of 14 variables:
## $ lat : num 40.3 40.3 40.1 40.1 40.3 ...
## $ lng : num -75.6 -75.3 -75.4 -75.3 -75.6 ...
## $ desc : Factor w/ 123847 levels " ; ; 2016-03-09 @ 05:15:47;",..: 87476 13086 47179 3373 19732 16789 58508 22429 64201 12311 ...
## $ zip : Factor w/ 105 levels "17752","18036",..: 103 83 70 70 NA 83 40 76 79 88 ...
## $ Types : Factor w/ 3 levels "EMS","Fire","Traffic": 1 1 2 1 1 1 1 1 1 3 ...
## $ Subtypes : Factor w/ 84 levels " ABDOMINAL PAINS",..: 10 22 38 16 25 42 50 60 69 78 ...
## $ timeStamp: POSIXlt, format: "2015-12-10 17:40:00" "2015-12-10 17:40:00" ...
## $ twp : Factor w/ 69 levels "","ABINGTON",..: 36 20 37 37 30 23 21 49 32 42 ...
## $ addr : Factor w/ 24150 levels "",".","10TH AVE",..: 17261 2302 9195 569 3713 3088 11374 4272 12392 2084 ...
## $ Date : Date, format: "2015-12-10" "2015-12-10" ...
## $ Year : Factor w/ 2 levels "2015","2016": 1 1 1 1 1 1 1 1 1 1 ...
## $ Month : Factor w/ 11 levels "1","2","3","4",..: 11 11 11 11 11 11 11 11 11 11 ...
## $ Day : Factor w/ 31 levels "1","2","3","4",..: 10 10 10 10 10 10 10 10 10 10 ...
## $ Hour : Factor w/ 24 levels "0","1","2","3",..: 18 18 18 18 18 18 18 18 18 18 ...
call_map <- leaflet() %>% addTiles() %>% setView(lng=-75.2, lat=40.1, zoom=10) %>% addMarkers(data=calls, lng= ~lng, lat= ~lat, clusterOptions = markerClusterOptions())
call_mapDifferent types of accidents require different types of experts. We checked the percentage of each three types of calls by pie chart.
freqt.calls <-as.data.frame(table(calls$Type))
freqt.calls## Var1 Freq
## 1 EMS 60939
## 2 Fire 18651
## 3 Traffic 44294
plot_ly(freqt.calls,labels=~Var1,values=~Freq,type='pie',
textposition='inside',
textinfo='label+percent',
insidetextfont=list(color='#FFFFFF',size=20),
hoverinfo='text',
marker=list(colors=c('rgb(0,204,102)','rgb(211,94,96)','rgb(51,153,255)'),
line=list(color='#FFFFFF',width=1)),
showlegend = FALSE)%>%
layout(title="Frequency of Three Types of 911 calls",
xaxis=list(showgrid=F,zeroline=F,showticklabels=F),
yaxis=list(showgrid=F,zeroline=F,showticklabels=F))We were curious whether number of calls for EMS would be higher than the other two variables, even within a certain period of time. We used the time series graph to figure it out.
fre.calls.ems <-as.data.frame(table(calls[calls$Types=="EMS",]$Date))
fre.calls.fire <-as.data.frame(table(calls[calls$Types=="Fire",]$Date))
fre.calls.traffic <-as.data.frame(table(calls[calls$Types=="Traffic",]$Date))
fre.calls.ems$Var1 <- as.Date(fre.calls.ems$Var1)
fre.calls.fire$Var1 <- as.Date(fre.calls.fire$Var1)
fre.calls.traffic$Var1 <- as.Date(fre.calls.traffic$Var1)
# Convert between character representations and objects of class "Date"
ems.ts <- xts(fre.calls.ems$Freq, fre.calls.ems$Var1)
fire.ts <- xts(fre.calls.fire$Freq, fre.calls.fire$Var1)
traffic.ts <- xts(fre.calls.traffic$Freq, fre.calls.traffic$Var1)
# Creating an extensible time-series object
names(ems.ts) <- "EMS"
names(fire.ts) <- "FIRE"
names(traffic.ts) <- "TRAFFIC" # Specify a column name
ts.types <- cbind(ems.ts, fire.ts, traffic.ts)
dygraph(ts.types, main="Three Types of 911 calls") %>%
dyRangeSelector() %>% # Choosing a range becomes available
dyOptions(colors=c('rgb(0,204,102)','rgb(211,94,96)','rgb(51,153,255)')) # Allocates each line's colorsIn order to see correlations between three types, we made a correlation matrix. By mousing over on a square box, we can see a correlation coefficient and by clicking it, we can see a scatter plot.
month <- as.data.frame(calls %>% select(-timeStamp) %>% group_by(Month, Types) %>% summarise(N=n()) %>% spread(Types, N))
day <- as.data.frame(calls %>% select(-timeStamp) %>% group_by(Day, Types) %>% summarise(N=n()) %>% spread(Types, N))
hour <- as.data.frame(calls %>% select(-timeStamp) %>% group_by(Hour, Types) %>% summarise(N=n()) %>% spread(Types, N))
iplotCorr(month[,2:4], reorder=TRUE, chartOpts=list(cortitle="Month", scattitle="Scatterplot"))iplotCorr(day[,2:4], reorder=TRUE, chartOpts=list(cortitle="Day", scattitle="Scatterplot"))iplotCorr(hour[,2:4], reorder=TRUE, chartOpts=list(cortitle="Hour", scattitle="Scatterplot"))Big accident might have occured in 23, Jan. We compared the number of accidents happened in 22nd and 23rd by hours. If there were a critical accident in 23rd, then a number of 911 calls for traffic accidents at a specific time would be significantly large.
nr<-c()
nn<-c()
for (i in 0:23){
nr[i+1]<-nrow(calls[calls$Date=="2016-01-23" & calls$Types=="Traffic"&calls$Hour==i,])
nn[i+1]<-nrow(calls[calls$Date=="2016-01-22" & calls$Types=="Traffic"&calls$Hour==i,])
}
nrn<-0:23
dnr<-data.frame(time=nrn,value=nr)
dnn<-data.frame(time=nrn,value=nn)
dnr$name<-"2016-01-23"
dnn$name<-"2016-01-22"
dd<-rbind(dnr,dnn)
ggplot(dd, aes(time, value, fill = name)) + geom_bar(position = "dodge",stat="identity")+labs(title="What Happened in 23 Jan, 2016?",x="Hour",y="Traffic Freq")+theme_bw()+scale_fill_discrete(name="Date")month$Month<-c(1:10,12)
month<-month[c(11,1:10),]
plot_ly(month, x = ~Month, y = ~EMS, type = 'bar', name = 'EMS') %>%
add_trace(y = ~Traffic, name = 'Traffic') %>% add_trace(y = ~Fire, name = 'Fire') %>%
layout(title="911 Calls -Monthly", yaxis = list(title = 'Count'), barmode = 'stack' ,xaxis=list(type = "category", categoryorder = "array",
categoryarray = c(12,1:10)))plot_ly(day, x = ~Day, y = ~EMS, type = 'bar', name = 'EMS') %>%
add_trace(y = ~Traffic, name = 'Traffic') %>% add_trace(y = ~Fire, name = 'Fire') %>%
layout(title="911 Calls -Daily", yaxis = list(title = 'Count'), barmode = 'stack')plot_ly(hour, x = ~Hour, y = ~EMS, type = 'bar', name = 'EMS') %>%
add_trace(y = ~Traffic, name = 'Traffic') %>% add_trace(y = ~Fire, name = 'Fire') %>%
layout(title="911 Calls - Hourly", yaxis = list(title = 'Count'), barmode = 'stack')date_calls<-unique(calls$Date)
thurs<-date_calls[seq(from=1,to=321,by=7)]
fris<-date_calls[seq(from=2,to=321,by=7)]
sats<-date_calls[seq(from=3,to=321,by=7)]
suns<-date_calls[seq(from=4,to=321,by=7)]
mons<-date_calls[seq(from=5,to=321,by=7)]
tues<-date_calls[seq(from=6,to=321,by=7)]
weds<-date_calls[seq(from=7,to=321,by=7)]
calls$Week[is.element(calls$Date,thurs)==TRUE]<-"Thursday"
calls$Week[is.element(calls$Date,fris)==TRUE]<-"Friday"
calls$Week[is.element(calls$Date,sats)==TRUE]<-"Saturday"
calls$Week[is.element(calls$Date,suns)==TRUE]<-"Sunday"
calls$Week[is.element(calls$Date,mons)==TRUE]<-"Monday"
calls$Week[is.element(calls$Date,tues)==TRUE]<-"Tuesday"
calls$Week[is.element(calls$Date,weds)==TRUE]<-"Wednesday"
calls$Week<-as.factor(calls$Week)
levels(calls$Week)<-c("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")
str(calls$Week)## Factor w/ 7 levels "Sunday","Monday",..: 5 5 5 5 5 5 5 5 5 5 ...
week <- as.data.frame(calls %>% select(-timeStamp) %>% group_by(Week, Types) %>% summarise(N=n()) %>% spread(Types, N))
plot_ly(week, x = ~Week, y = ~EMS, type = 'bar', name = 'EMS') %>%
add_trace(y = ~Traffic, name = 'Traffic') %>% add_trace(y = ~Fire, name = 'Fire') %>%
layout(title="911 Calls -Weekly", yaxis = list(title = 'Count'), barmode = 'stack')A heat map is a three-dimensional representation of data in which values are represented by colors. Let’s see heatmaps of 911 calls by Month, Day and Hour. The number of 911 calls gets larger as a color of square goes closer to yellow.
day_hour <- as.data.frame(calls[, c("Day", "Hour")] %>% group_by(Day, Hour) %>% summarise(N = n()))
ggplotly(ggplot(day_hour, aes(Day, Hour, fill = N)) + geom_tile(color = "white", size = 0.1) + scale_fill_viridis(name="Number of Calls") + coord_equal() + labs(title="911 Calls by Day and Hour"))month_hour <- as.data.frame(calls[, c("Month", "Hour")] %>% group_by(Month, Hour) %>% summarise(N = n()))
levels(month_hour$Month) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Dec")
ggplotly(ggplot(month_hour, aes(Month,Hour, fill = N)) + geom_tile(color = "white", size = 0.1) + scale_fill_viridis(name="Number of Calls") + coord_equal() + labs(title="911 Calls by Month and Hour"))In the following Heatmap, There are blanks in February, April, June, September because they do not consist of 31 days. Blanks in October and December are because there are less data in these months, as explained above.
day_hour <- as.data.frame(calls[, c("Day", "Month")] %>% group_by(Day, Month) %>% summarise(N = n()))
levels(day_hour$Month) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Dec")
ggplotly(ggplot(day_hour, aes(Day,Month, fill = N)) + geom_tile(color = "white", size = 0.1) + scale_fill_viridis(name="Number of Calls") + coord_equal() + labs(title="911 Calls by Day and Hour"))town_type <- as.data.frame(calls %>% select(Types, twp))
freq.town <- as.data.frame(table(calls$twp)) %>% arrange(desc(Freq))
top<-as.character(freq.town[1:10,1])
top_town<-town_type[is.element(town_type$twp,top),]
topab<-c("LM", "ABING", "NORRIS", "UM", "CHEL", "POTT", "UML", "LP", "PLY", "HOR")
top_town$twp <- as.character(top_town$twp)
top_town$twp<-factor(top_town$twp,levels=top)
levels(top_town$twp)<-topab
table_1 <- with(top_town, table(twp,Types))
mosaicplot(table_1, main="Mosaic Plot by Types and Top 10 townships",color=c('light green','tomato','light blue'),cex.axis =1,off=3,border="white",xlab="Township",ylab="Type")LM : LOWER MERIONABING : ABINGTONNORRIS : NORRISTOWNUM : UPPER MERIONCHEL : CHELTENHAMPOTT : POTTSTOWNUML : UPPER MORELANDLP : LOWER PROVIDENCEPLY : PLYMOUTHHOR : HORSHAMems.subtype <- as.data.frame(table(calls[calls$Types=="EMS",]$Subtypes)) %>% arrange(desc(Freq))
ems.subtype$Var1<-paste(ems.subtype$Var1,"E",sep=" - ")
fire.subtype <- as.data.frame(table(calls[calls$Types=="Fire",]$Subtypes)) %>% arrange(desc(Freq))
fire.subtype$Var1<-paste(fire.subtype$Var1,"F",sep=" - ")
traffic.subtype <- as.data.frame(table(calls[calls$Types=="Traffic",]$Subtypes)) %>% arrange(desc(Freq))
traffic.subtype$Var1<-paste(traffic.subtype$Var1,"T")
freq.sub <- rbind(ems.subtype,fire.subtype,traffic.subtype)%>%arrange(desc(Freq))
freq.sub %>% head(5) %>% ggplot(aes(reorder(Var1,Freq), Freq)) + geom_bar(stat = "identity", aes(fill=Var1)) + coord_flip() + theme_bw() + ggtitle("911 Calls") + xlab("Top 5 Subtypes") + ylab("Frequency") + theme(legend.position = "none") + geom_text(aes(label=Freq, y= Freq + 2*sign(Freq),size=2)) ems.sub <- as.data.frame(calls[calls$Types=="EMS",] %>% select(Subtypes, twp))
ems.sub$Subtypes <- paste(ems.sub$Subtypes,"E",sep=" - ")
fire.sub <- as.data.frame(calls[calls$Types=="Fire",] %>% select(Subtypes, twp))
fire.sub$Subtypes <- paste(fire.sub$Subtypes,"F",sep=" - ")
traffic.sub <- as.data.frame(calls[calls$Types=="Traffic",] %>% select(Subtypes, twp))
traffic.sub$Subtypes <- paste(traffic.sub$Subtypes,"T")
mosaic.sub <- rbind(ems.sub, fire.sub, traffic.sub)
top <- as.character(freq.town[1:10,1]) # Top 10 township
topp <- as.character(freq.sub[1:5,1]) # Top 5 Subtypes
mosaic.sub <- mosaic.sub[is.element(mosaic.sub$twp,top),]
mosaic.sub <- mosaic.sub[is.element(mosaic.sub$Subtypes,topp),]
topabc <- c("VA-T","DV-T","FA-F","RE-E","CE-E")
topab <- c("LM", "ABING", "NORRIS", "UM", "CHEL", "POTT", "UML", "LP", "PLY", "HOR")
mosaic.sub$twp <- as.character(mosaic.sub$twp)
mosaic.sub$Subtypes <- as.character(mosaic.sub$Subtypes)
mosaic.sub$twp <- factor(mosaic.sub$twp,levels=top)
mosaic.sub$Subtypes <- factor(mosaic.sub$Subtypes,levels=topp)
levels(mosaic.sub$twp)<-topab
levels(mosaic.sub$Subtypes)<-topabc
table_2 <- with(mosaic.sub, table(twp,Subtypes))
mosaicplot(table_2, las=1,main="Mosaic Plot by Top 5 Subtypes and Top 10 townships", cex.axis =1,off=3,border="white",xlab="Township",ylab="Subtype",color=c('orange','sky blue','green','steelblue4','tomato1'))VA-T : VEHICLE ACCIDENT (Traffic)DV-T : DISABLED VEHICLE (Traffic)FA-F : FIRE ALARM (Fire)RE-E : RESPIRATORY EMERGENCY (EMS)CE-E : CARDIAC EMERGENCY (EMS)monthhour.sub <- as.data.frame(calls[, c("Month", "Hour","twp")] %>% group_by(Month, Hour,twp) %>% summarise(N = n()))
top<-as.character(freq.town[1:10,1])
monthhour.sub <- monthhour.sub[is.element(monthhour.sub$twp,top),]
levels(monthhour.sub$Month) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Dec")
ggplot(monthhour.sub, aes(Month, Hour, fill = N)) + geom_tile(color = "white", size = 0.1) + scale_fill_viridis(name="Number of Calls") + coord_equal() + labs(title="911 Calls by Month and Hour") + facet_wrap(~twp, ncol=5) + theme(legend.position="bottom") dayhour.sub <- as.data.frame(calls[, c("Day", "Hour","twp")] %>% group_by(Day, Hour,twp) %>% summarise(N = n()))
top<-as.character(freq.town[1:10,1])
dayhour.sub <- dayhour.sub[is.element(dayhour.sub$twp,top),]
ggplot(dayhour.sub, aes(Day, Hour, fill = N)) + geom_tile(color = "white", size = 0.1) + scale_fill_viridis(name="Number of Calls") + coord_equal() + labs(title="911 Calls by Day and Hour") + facet_wrap(~twp, ncol=5) + theme(legend.position="bottom")ggplot(monthhour.sub, aes(Month, Hour, fill = sqrt(N) )) + geom_tile(color = "white", size = 0.1) + scale_fill_viridis(name="Number of Calls") + coord_equal() + labs(title="911 Calls by Month and Hour") + facet_wrap(~twp, ncol=5) + theme(legend.position="bottom") ggplot(dayhour.sub, aes(Day, Hour, fill = sqrt(N))) + geom_tile(color = "white", size = 0.1) + scale_fill_viridis(name="Number of Calls") + coord_equal() + labs(title="911 Calls by Day and Hour") + facet_wrap(~twp, ncol=5) + theme(legend.position="bottom")Now let’s forecast the number of accidents that might happen in the future. Time-series analysis is used for analyzation.
First, we computed autocorrelation function and partial autocorrelation function to check briefly whether the data are stationary time-series.
Second, we checked whether differences are necessary by using ndiffs function.
Third, we did Dickey-Fuller test for null-hypothesis, data are non-stationary. We could reject null-hypothesis.
Lastly, we tried to find the best ARIMA model by their AIC, AICc and BIC values.
forecasted_ems <- forecast(fitted_ems,15)
forecasted_fire <- forecast(fitted_fire,15)
forecasted_traffic <- forecast(fitted_traffic,15)
# forecasting from time series
op <- par(mfrow = c(3,1),
mar = c(2,4,1,2)+.1, pty='m')
plot(forecasted_ems, main="Forecast: Emergency Medical Service", xlab="Time",xaxt = "n")
axis(1, at=1:51, labels=seq(1,357,by=7))
plot(forecasted_fire, main="Forecast: Fire", xlab="Time",xaxt="n")
axis(1, at=1:51, labels=seq(1,357,by=7))
plot(forecasted_traffic, main="Forecast: Traffic", xlab="Time",xaxt="n")
axis(1, at=1:51, labels=seq(1,357,by=7))par(op)Date <- c("2016-10-26", "2016-10-27", "2016-10-28", "2016-10-29", "2016-10-30", "2016-10-31", "2016-11-01")
EMS <- round(as.data.frame(forecast(fitted_ems,7))[,1],2)
Fire <- round(as.data.frame(forecast(fitted_fire,7))[,1],2)
Traffic <- round(as.data.frame(forecast(fitted_traffic,7))[,1],2)
forecast_day <- data.frame(Date, EMS, Fire, Traffic)
knitr::kable(forecast_day)| Date | EMS | Fire | Traffic |
|---|---|---|---|
| 2016-10-26 | 185.46 | 44.36 | 106.18 |
| 2016-10-27 | 191.32 | 54.11 | 126.79 |
| 2016-10-28 | 191.29 | 55.05 | 147.67 |
| 2016-10-29 | 189.47 | 56.07 | 147.29 |
| 2016-10-30 | 186.79 | 57.90 | 118.06 |
| 2016-10-31 | 193.85 | 59.03 | 141.82 |
| 2016-11-01 | 183.12 | 61.27 | 125.71 |
It’s impossible to predict frequency of 911 calls using this data.
We should do Baysian analysis to predict the accidents.
We ought to collect more variables that are related closely to the occurence of each types of accidents(such as floating population, weather, and etc.) for Bayesian analysis. This data is not enough.
By analyzing this data, we were highly surprised once again by rescue workers’ devotion to the society. Most of the work has its busy season and off-season. We figured out that ,for rescue workers, every day was busy season. Even at night, some of them have to be on night duty. I want to claim that, we must have a deep respect for them. Furthermore, the state should provide them with high quality of welfare and benefit.
All types of accidents are laregly correlated with each others. What we have to focus on, is putting our effort in preventing accidents from happening, as it may result in more additional accidents.
* Writer: Cho Sungin, Jang Yoonseo
* Creation date: Nov 9, 2016